Improving TTS Synthesis for Emotional Expressivity by a Prosodic Parameterization of Affect based on Linguistic Analysis
Abstract
Affective speech synthesis is important for applications such as storytelling, speech-based user interfaces, and computer games. However, some studies have revealed that Text-To-Speech (TTS) systems tend not to convey suitable emotional expressivity in their output. Thanks to the recent convergence of several analytical studies on affect and human speech, this problem can now be tackled from a new angle, one that has at its core an appropriate prosodic parameterization based on intelligent detection of the affective cues in the input text. Combined with recent findings in affective speech analysis, this allows a suitable assignment of pitch accents, other prosodic parameters, and signal properties that adhere to F0 and match the optimal parameterization for the emotion detected in the input text. Such an approach allows the input text to be enriched with meta-information that efficiently assists the TTS system. Furthermore, the output of the TTS system is post-processed to enhance its affective content. Several preliminary tests confirm the validity of our approach and encourage us to continue exploring it.
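The idea of enriching input text with prosodic meta-information can be illustrated with a minimal Python sketch. It is not the authors' implementation: the cue-word lexicon, the emotion labels, the prosody values, and the function names `detect_emotion` and `to_ssml` are all assumptions made for illustration. The only external convention used is the `prosody` element of the W3C SSML markup, with its `pitch`, `rate`, and `volume` attributes.

```python
from xml.sax.saxutils import escape

# Hypothetical toy lexicon: cue word -> emotion label (illustrative only).
AFFECT_LEXICON = {
    "happy": "joy", "delighted": "joy", "wonderful": "joy",
    "sad": "sadness", "cried": "sadness", "alone": "sadness",
    "furious": "anger", "hate": "anger",
}

# Hypothetical prosodic settings per emotion; these are NOT values from the paper.
PROSODY = {
    "joy": {"pitch": "+15%", "rate": "fast", "volume": "loud"},
    "sadness": {"pitch": "-10%", "rate": "slow", "volume": "soft"},
    "anger": {"pitch": "+5%", "rate": "fast", "volume": "x-loud"},
}


def detect_emotion(sentence):
    """Return the emotion whose cue words occur most often, else 'neutral'."""
    counts = {}
    for token in sentence.lower().split():
        word = token.strip(".,!?;:\"'")
        emotion = AFFECT_LEXICON.get(word)
        if emotion:
            counts[emotion] = counts.get(emotion, 0) + 1
    if not counts:
        return "neutral"
    return max(counts, key=counts.get)


def to_ssml(sentence):
    """Wrap a sentence in an SSML prosody element chosen by its detected emotion."""
    text = escape(sentence)  # escape &, <, > so the markup stays well-formed
    settings = PROSODY.get(detect_emotion(sentence))
    if settings is None:
        # Neutral sentences are passed through without prosodic annotation.
        return f"<s>{text}</s>"
    attrs = " ".join(f'{name}="{value}"' for name, value in settings.items())
    return f"<s><prosody {attrs}>{text}</prosody></s>"


if __name__ == "__main__":
    for line in ["He cried alone.", "It rains.", "What a wonderful day!"]:
        print(to_ssml(line))
```

A real system would replace the word lookup with proper linguistic analysis and derive the prosodic values from affective speech studies; the sketch only shows how a detected emotion can become markup that a TTS engine consumes.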
Related resources
How to improve TTS systems for emotional expressivity
Several experiments have been carried out that revealed weaknesses of the current Text-To-Speech (TTS) systems in their emotional expressivity. Although some TTS systems allow XML-based representations of prosodic and/or phonetic variables, few publications considered, as a pre-processing stage, the use of intelligent text processing to detect affective information that can be used to tailor th...
Affective story teller: a TTS system for emotional expressivity
Some Text-to-Speech (TTS) systems have revealed weaknesses in their emotional expressivity, but this situation can be improved by a better parameterization of the acoustic and prosodic parameters. This paper describes a system, Affective Story Teller (AST), as an example of an emotionally expressive speech synthesizer. Our technique uses several linguistic resources that recognize emotions in the input...
Emotional FESTIVAL-MBROLA TTS synthesis
The topic of this work is an extension of our previous research on the development of a general data-driven procedure for creating a neutral “narrative-style” prosodic module for the Italian FESTIVAL Text-To-Speech (TTS) synthesizer, and it is focused on investigating and implementing new strategies for building a new emotional FESTIVAL TTS. The new emotional prosodic modules, similarly to the ...
Automatic prosodic modeling for speaker and task adaptation in text-to-speech
One of the most important demands for future TTS systems is their ability to improve naturalness when embedded in a particular task or application that requires a particular speaking style for a particular speaker. In this paper, we present a new prosodic modeling procedure for improving naturalness by adapting a TTS system to a new speaker and a new speaking style. The proposed procedure is an...
Decision tree micro-prosody structures for text to speech synthesis
This paper explores the use of micro-prosody in improving the quality of synthesised speech in concatenated text to speech synthesis (TTS) systems. Micro-prosody is defined as prosodic signals within context-dependent triphone units and across neighbouring triphones. Micro-prosody parameters are modelled using a Markovian model whose state distributions depend on the current linguistic-prosodi...
Publication date: 2009